forked from xapi-project/sm
Fix GC hidden attribute in case of SIGTERM signal #87
Merged
The GC can be interrupted by a SIGTERM signal. If the signal arrives while a
volume's hidden flag is being modified, the on-disk flag and the GC's cached
copy of it can end up out of sync.
For example, in the situation below the hidden flag of a volume has been changed,
but the cached value (`self.hidden`) in the Python process still holds the old value
because a `util.CommandException` was thrown before the cache was updated. A VDI
that should not be hidden is still hidden after `_undoInterruptedCoalesceLeaf`
runs, because the cached hidden value it relied on was not the correct one.
Code:
```
def _setHidden(self, hidden=True):
    vhdutil.setHidden(self.path, hidden)
    # Exception! Next line is never executed.
    self.hidden = hidden
```
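The failure mode can be reproduced in isolation with a small model. This is only a sketch, not code from `cleanup.py`: `FakeDisk`, the simplified `CommandException`, `VDI` and `undo_leaf_coalesce` are hypothetical stand-ins, and the interrupted `vhd-util` call is simulated by a write that takes effect and then raises. It also assumes that the undo path decides what to do from the cached `hidden` value.

```python
class CommandException(Exception):
    """Simplified stand-in for util.CommandException (hypothetical)."""


class FakeDisk:
    """Hypothetical stand-in for the on-disk VHD header."""

    def __init__(self):
        self.hidden = False
        self.interrupt_next_write = False

    def set_hidden(self, value):
        # The flag reaches the disk...
        self.hidden = value
        # ...but the caller sees a failure (simulated SIGTERM).
        if self.interrupt_next_write:
            self.interrupt_next_write = False
            raise CommandException("Signalled 15")


class VDI:
    def __init__(self, disk):
        self.disk = disk
        self.hidden = disk.hidden  # cached copy of the on-disk flag

    def _setHidden(self, hidden=True):
        self.disk.set_hidden(hidden)
        # Skipped when set_hidden() raises, so the cache goes stale.
        self.hidden = hidden


def undo_leaf_coalesce(child):
    # Assumed, simplified undo: unhide the child only if the cache
    # says it is hidden.
    if child.hidden:
        child._setHidden(False)


disk = FakeDisk()
child = VDI(disk)
disk.interrupt_next_write = True
try:
    child._setHidden(True)
except CommandException:
    undo_leaf_coalesce(child)

# The cache claims the VDI is visible while the disk says it is hidden.
print("cached:", child.hidden, "on disk:", disk.hidden)
```

In this model the undo step never issues the call that would unhide the child, which matches the trace below: the only `vhd-util set` in the undo phase targets the parent.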
Trace:
```
Jun 5 09:15:50 r620-q6 SMGC: [563219] Removed vhd-parent from dce4b0fc(2.000G/170.336M?)
Jun 5 09:15:50 r620-q6 SMGC: [563219] Removed vhd-blocks from dce4b0fc(2.000G/170.336M?)
Jun 5 09:15:50 r620-q6 SM: [563219] ['/usr/bin/vhd-util', 'set', '--debug', '-n', '/var/run/sr-mount/f816795d-e7a9-43df-170c-23bc329607fc/OLD_dce4b0fc-6ad1-4750-857b-45d8d2758503.vhd', '-f', 'hidden', '-v', '1']
Jun 5 09:15:50 r620-q6 SM: [563219] GC: recieved SIGTERM
Jun 5 09:15:50 r620-q6 SM: [563219] FAILED in util.pread: (rc -15) stdout: '', stderr: ''
Jun 5 09:15:50 r620-q6 SMGC: [563219] *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
Jun 5 09:15:50 r620-q6 SMGC: [563219] ***********************
Jun 5 09:15:50 r620-q6 SMGC: [563219] * E X C E P T I O N *
Jun 5 09:15:50 r620-q6 SMGC: [563219] ***********************
Jun 5 09:15:50 r620-q6 SMGC: [563219] _doCoalesceLeaf: EXCEPTION <class 'util.CommandException'>, Signalled 15
Jun 5 09:15:50 r620-q6 SMGC: [563219] File "/opt/xensource/sm/cleanup.py", line 2653, in _liveLeafCoalesce
Jun 5 09:15:50 r620-q6 SMGC: [563219] self._doCoalesceLeaf(vdi)
Jun 5 09:15:50 r620-q6 SMGC: [563219] File "/opt/xensource/sm/cleanup.py", line 2717, in _doCoalesceLeaf
Jun 5 09:15:50 r620-q6 SMGC: [563219] vdi._setHidden(True)
Jun 5 09:15:50 r620-q6 SMGC: [563219] File "/opt/xensource/sm/cleanup.py", line 1063, in _setHidden
Jun 5 09:15:50 r620-q6 SMGC: [563219] vhdutil.setHidden(self.path, hidden)
Jun 5 09:15:50 r620-q6 SMGC: [563219] File "/opt/xensource/sm/vhdutil.py", line 235, in setHidden
Jun 5 09:15:50 r620-q6 SMGC: [563219] ret = ioretry(cmd)
Jun 5 09:15:50 r620-q6 SMGC: [563219] File "/opt/xensource/sm/vhdutil.py", line 94, in ioretry
Jun 5 09:15:50 r620-q6 SMGC: [563219] errlist=[errno.EIO, errno.EAGAIN])
Jun 5 09:15:50 r620-q6 SMGC: [563219] File "/opt/xensource/sm/util.py", line 347, in ioretry
Jun 5 09:15:50 r620-q6 SMGC: [563219] return f()
Jun 5 09:15:50 r620-q6 SMGC: [563219] File "/opt/xensource/sm/vhdutil.py", line 93, in <lambda>
Jun 5 09:15:50 r620-q6 SMGC: [563219] return util.ioretry(lambda: util.pread2(cmd, text=text),
Jun 5 09:15:50 r620-q6 SMGC: [563219] File "/opt/xensource/sm/util.py", line 255, in pread2
Jun 5 09:15:50 r620-q6 SMGC: [563219] return pread(cmdlist, quiet=quiet, text=text)
Jun 5 09:15:50 r620-q6 SMGC: [563219] File "/opt/xensource/sm/util.py", line 217, in pread
Jun 5 09:15:50 r620-q6 SMGC: [563219] raise CommandException(rc, str(cmdlist), stderr.strip())
Jun 5 09:15:50 r620-q6 SMGC: [563219]
Jun 5 09:15:50 r620-q6 SMGC: [563219] *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
Jun 5 09:15:50 r620-q6 SMGC: [563219] *** UNDO LEAF-COALESCE
Jun 5 09:15:50 r620-q6 SMGC: [563219] Renaming parent back: dce4b0fc-6ad1-4750-857b-45d8d2758503 -> 056b6f93-66ff-460a-9354-157540b584a8
Jun 5 09:15:50 r620-q6 SMGC: [563219] Renaming /var/run/sr-mount/f816795d-e7a9-43df-170c-23bc329607fc/dce4b0fc-6ad1-4750-857b-45d8d2758503.vhd -> /var/run/sr-mount/f816795d-e7a9-43df-170c-23bc329607fc/056b6f93-66ff-460a-9354-157540b584a8.vhd
Jun 5 09:15:50 r620-q6 SMGC: [563219] Renaming child back to dce4b0fc-6ad1-4750-857b-45d8d2758503
Jun 5 09:15:50 r620-q6 SMGC: [563219] Renaming /var/run/sr-mount/f816795d-e7a9-43df-170c-23bc329607fc/OLD_dce4b0fc-6ad1-4750-857b-45d8d2758503.vhd -> /var/run/sr-mount/f816795d-e7a9-43df-170c-23bc329607fc/dce4b0fc-6ad1-4750-857b-45d8d2758503.vhd
Jun 5 09:15:50 r620-q6 SMGC: [563219] Updating the VDI record
Jun 5 09:15:50 r620-q6 SMGC: [563219] Set vhd-parent = 056b6f93-66ff-460a-9354-157540b584a8 for dce4b0fc(2.000G/8.500K?)
Jun 5 09:15:50 r620-q6 SMGC: [563219] Set vdi_type = vhd for dce4b0fc(2.000G/8.500K?)
Jun 5 09:15:50 r620-q6 SM: [563219] ['/usr/bin/vhd-util', 'set', '--debug', '-n', '/var/run/sr-mount/f816795d-e7a9-43df-170c-23bc329607fc/056b6f93-66ff-460a-9354-157540b584a8.vhd', '-f', 'hidden', '-v', '1']
Jun 5 09:15:50 r620-q6 SM: [563219] pread SUCCESS
Jun 5 09:15:50 r620-q6 SMGC: [563219] *** leaf-coalesce undo successful
```
Therefore, a VDI impacted by this problem remains hidden and can no longer
be used correctly without manual intervention:
```
Jun 5 09:16:29 r620-q6 SM: [566174] lock: released /var/lock/sm/f816795d-e7a9-43df-170c-23bc329607fc/sr
Jun 5 09:16:29 r620-q6 SM: [566174] ***** generic exception: vdi_clone: EXCEPTION <class 'xs_errors.SROSError'>, Failed to clone VDI [opterr=hidden VDI]
Jun 5 09:16:29 r620-q6 SM: [566174] File "/opt/xensource/sm/SRCommand.py", line 113, in run
Jun 5 09:16:29 r620-q6 SM: [566174] return self._run_locked(sr)
Jun 5 09:16:29 r620-q6 SM: [566174] File "/opt/xensource/sm/SRCommand.py", line 163, in _run_locked
Jun 5 09:16:29 r620-q6 SM: [566174] rv = self._run(sr, target)
Jun 5 09:16:29 r620-q6 SM: [566174] File "/opt/xensource/sm/SRCommand.py", line 270, in _run
Jun 5 09:16:29 r620-q6 SM: [566174] return target.clone(self.params['sr_uuid'], self.vdi_uuid)
Jun 5 09:16:29 r620-q6 SM: [566174] File "/opt/xensource/sm/FileSR.py", line 704, in clone
Jun 5 09:16:29 r620-q6 SM: [566174] return self._do_snapshot(sr_uuid, vdi_uuid, VDI.SNAPSHOT_DOUBLE)
Jun 5 09:16:29 r620-q6 SM: [566174] File "/opt/xensource/sm/FileSR.py", line 754, in _do_snapshot
Jun 5 09:16:29 r620-q6 SM: [566174] return self._snapshot(snapType, cbtlog, consistency_state)
Jun 5 09:16:29 r620-q6 SM: [566174] File "/opt/xensource/sm/FileSR.py", line 797, in _snapshot
Jun 5 09:16:29 r620-q6 SM: [566174] raise xs_errors.XenError('VDIClone', opterr='hidden VDI')
Jun 5 09:16:29 r620-q6 SM: [566174]
```
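One possible way to make the cached value robust against such an interruption is to invalidate it before touching the disk and re-read it lazily afterwards. The sketch below illustrates that idea under simplifying assumptions and is not necessarily the change made in this PR: `FakeDisk`, `read_hidden`, the `hidden` property and `undo_leaf_coalesce` are hypothetical names, and the interrupted command is simulated by a write that takes effect and then raises.

```python
class CommandException(Exception):
    """Simplified stand-in for util.CommandException (hypothetical)."""


class FakeDisk:
    """Hypothetical stand-in for the on-disk VHD header."""

    def __init__(self):
        self._hidden = False
        self.interrupt_next_write = False

    def read_hidden(self):
        return self._hidden

    def set_hidden(self, value):
        # The write lands, then the caller sees a failure (simulated SIGTERM).
        self._hidden = value
        if self.interrupt_next_write:
            self.interrupt_next_write = False
            raise CommandException("Signalled 15")


class VDI:
    def __init__(self, disk):
        self.disk = disk
        self._hidden = None  # None means "unknown, re-read from disk"

    @property
    def hidden(self):
        if self._hidden is None:
            self._hidden = self.disk.read_hidden()
        return self._hidden

    def _setHidden(self, hidden=True):
        # Drop the cached value first: if the write below raises, the next
        # access to `hidden` re-reads the real state instead of a stale one.
        self._hidden = None
        self.disk.set_hidden(hidden)
        self._hidden = hidden


def undo_leaf_coalesce(child):
    # Assumed, simplified undo: unhide the child if it is currently hidden.
    if child.hidden:
        child._setHidden(False)


disk = FakeDisk()
child = VDI(disk)
disk.interrupt_next_write = True
try:
    child._setHidden(True)
except CommandException:
    undo_leaf_coalesce(child)

print("cached:", child.hidden, "on disk:", disk.read_hidden())
```

With the cache invalidated up front, the undo step sees the on-disk state and unhides the child, so the cached and on-disk values agree again.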
Signed-off-by: Ronan Abhamon <[email protected]>
Is this a candidate for an upstream PR?
Nambrok
approved these changes
Jun 11, 2025
Upstream PR: xapi-project#760
Wescoeur
added a commit
that referenced
this pull request
Aug 28, 2025
Wescoeur
added a commit
that referenced
this pull request
Aug 28, 2025
Wescoeur
added a commit
that referenced
this pull request
Aug 28, 2025